25 research outputs found

    BLAST-EXPLORER helps you building datasets for phylogenetic analysis

    Background: Sampling the right homologous sequences for phylogenetic or molecular evolution analyses is a crucial step whose quality can significantly affect the final interpretation of a study. There is no single way to construct datasets suitable for phylogenetic analysis, because the task depends closely on the scientific question being addressed. Moreover, database mining software such as BLAST, which is routinely used to search for homologous sequences, is not specifically optimized for this task.
    Results: To fill this gap, we designed BLAST-Explorer, an original and user-friendly web-based application that combines a BLAST search with a suite of tools for interactive, phylogeny-oriented exploration of the BLAST results and flexible selection of homologous sequences among the BLAST hits. Once the BLAST hits have been selected in BLAST-Explorer, the corresponding sequences can be imported locally for external analysis or passed to the phylogenetic tree reconstruction pipelines available on the Phylogeny.fr platform.
    Conclusions: BLAST-Explorer provides a simple, intuitive and interactive graphical representation of the BLAST results and allows selection and retrieval of the BLAST hit sequences based on a wide range of criteria. Although BLAST-Explorer primarily aims to help construct sequence datasets for further phylogenetic study, it can also be used as a standard BLAST server with enriched output. BLAST-Explorer is available at http://www.phylogeny.fr

    Plankton networks driving carbon export in the oligotrophic ocean

    The biological carbon pump is the process by which CO2 is transformed to organic carbon via photosynthesis, exported through sinking particles, and finally sequestered in the deep ocean. While the intensity of the pump correlates with plankton community composition, the underlying ecosystem structure driving the process remains largely uncharacterized. Here we use environmental and metagenomic data gathered during the Tara Oceans expedition to improve our understanding of carbon export in the oligotrophic ocean. We show that specific plankton communities, from the surface and deep chlorophyll maximum, correlate with carbon export at 150 m and highlight unexpected taxa such as Radiolaria and alveolate parasites, as well as Synechococcus and their phages, as the lineages most strongly associated with carbon export in the subtropical, nutrient-depleted, oligotrophic ocean. Additionally, we show that the relative abundance of a few bacterial and viral genes can predict a significant fraction of the variability in carbon export in these regions.

    Total V9 rDNA information organized at the metabarcode level for the Tara Oceans Expedition (2009-2012)

    The present data set provides a tab-separated text file compressed in a zip archive. The file includes metadata for each Tara Oceans V9 rDNA metabarcode, with the following fields: md5sum = unique identifier; lineage = taxonomic path associated with the metabarcode; pid = % identity to the closest reference barcode from V9_PR2; sequence = nucleotide sequence of the metabarcode; refs = identity of the best-hit reference sequence(s); TARA_xxx = number of occurrences of this barcode in each of the 334 samples; totab = total abundance of the barcode; cid = identifier of the OTU to which the barcode belongs; and taxogroup = high-level taxonomic assignment of this barcode.
    The file also includes three categories of functional annotations: (1) Chloroplast: yes, presence of a permanent chloroplast; no, absence of a permanent chloroplast; NA, undetermined. (2) Symbiont (small partner): parasite, the species is a parasite; commensal, the species is a commensal; mutualist, the species is a mutualist symbiont, most often a microalgal taxon involved in photosymbiosis; no, the species is not involved in a symbiosis as the small partner; NA, undetermined. (3) Symbiont (host): photo, the host species relies on a mutualistic microalgal photosymbiont to survive (obligatory photosymbiosis); photo_falc, same as photo, but the relationship is facultative; photo_klep, the host species maintains chloroplasts from microalgal prey to survive; photo_klep_falc, same as photo_klep, but facultative; Nfix, the host species must interact with a mutualistic symbiont providing N2 fixation to survive; Nfix_falc, same as Nfix, but facultative; no, the species is not involved in any mutualistic symbiosis; NA, undetermined.
    For example, the collodarian/Brandtodinium symbiosis is annotated as Chloroplast, "no"; Symbiont (small), "no"; Symbiont (host), "photo" for the collodarian host, and as Chloroplast, "yes"; Symbiont (small), "mutualist"; Symbiont (host), "no" for the dinoflagellate microalgal endosymbiont.
    The corresponding columns are: chloroplast = "yes", "no" or "NA"; symbiont.small = "parasite", "commensal", "mutualist", "no" or "NA"; symbiont.host = "photo", "photo_falc", "photo_klep", "Nfix", "no" or "NA"; benef = "Nfix", "no" or "NA"; trophism = "Metazoa", "heterotroph", "NA", "photosymbiosis" or "phototroph", derived from the previous fields.
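As a rough illustration of how such a table might be consumed, the sketch below reads a tab-separated excerpt and filters it on the functional annotations. The column names come from the description above; the two data rows, their identifiers and abundances, and the `|` lineage separator are hypothetical, and the real file also carries the sequence, refs and 334 TARA_xxx columns.

```python
import csv
import io

# Hypothetical two-row excerpt mirroring the collodarian/Brandtodinium example.
# The "|" lineage separator and all values are assumptions for illustration.
SAMPLE = (
    "md5sum\tlineage\tpid\ttotab\tcid\ttaxogroup\t"
    "chloroplast\tsymbiont.small\tsymbiont.host\n"
    "aaa111\tEukaryota|Rhizaria|Collodaria\t98.5\t120\t1\tCollodaria\t"
    "no\tno\tphoto\n"
    "bbb222\tEukaryota|Alveolata|Dinophyceae\t100.0\t30\t2\tDinophyceae\t"
    "yes\tmutualist\tno\n"
)


def read_metabarcodes(handle):
    """Yield one dict per metabarcode row, converting the numeric fields."""
    for row in csv.DictReader(handle, delimiter="\t"):
        row["pid"] = float(row["pid"])
        row["totab"] = int(row["totab"])
        yield row


def abundance_by(rows, field):
    """Sum totab over rows, grouped by the value found in `field`."""
    totals = {}
    for row in rows:
        totals[row[field]] = totals.get(row[field], 0) + row["totab"]
    return totals


rows = list(read_metabarcodes(io.StringIO(SAMPLE)))

# Metabarcodes whose host relies on an obligatory photosymbiont.
photo_hosts = [r["md5sum"] for r in rows if r["symbiont.host"] == "photo"]
print(photo_hosts)

# Total abundance split by presence of a permanent chloroplast.
print(abundance_by(rows, "chloroplast"))
```

For the real archive, the same reader could be pointed at a file opened in text mode from the unzipped download instead of the in-memory excerpt.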

    Total V9 rDNA information organized at the OTU level for the Tara Oceans Expedition (2009-2012)

    The present data set provides a tab-separated text file compressed in a zip archive. The file includes metadata for each Tara Oceans V9 rDNA OTU, with the following fields: md5sum = identifier of the representative (most abundant) sequence of the swarm; cid = identifier of the OTU; totab = total abundance of barcodes in this OTU; TARA_xxx = number of occurrences of barcodes in this OTU in each of the 334 samples; rtotab = total abundance of the representative barcode; pid = percentage identity of the representative barcode to the closest reference sequence from V9_PR2; lineage = taxonomic path assigned to the representative barcode; refs = best-hit reference sequence(s) for the representative barcode; taxogroup = high-level taxonomic assignment of the representative barcode.
    The file also includes three categories of functional annotations: (1) Chloroplast: yes, presence of a permanent chloroplast; no, absence of a permanent chloroplast; NA, undetermined. (2) Symbiont (small partner): parasite, the species is a parasite; commensal, the species is a commensal; mutualist, the species is a mutualist symbiont, most often a microalgal taxon involved in photosymbiosis; no, the species is not involved in a symbiosis as the small partner; NA, undetermined. (3) Symbiont (host): photo, the host species relies on a mutualistic microalgal photosymbiont to survive (obligatory photosymbiosis); photo_falc, same as photo, but the relationship is facultative; photo_klep, the host species maintains chloroplasts from microalgal prey to survive; photo_klep_falc, same as photo_klep, but facultative; Nfix, the host species must interact with a mutualistic symbiont providing N2 fixation to survive; Nfix_falc, same as Nfix, but facultative; no, the species is not involved in any mutualistic symbiosis; NA, undetermined.
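The relationship between the OTU-level fields can be sketched as follows: grouping metabarcodes by cid, summing their abundances to obtain totab, and taking the most abundant member as the representative (md5sum, rtotab). This is only an illustration of the field definitions above, not the pipeline used to build the data set; the records below are hypothetical.

```python
from collections import defaultdict

# Hypothetical metabarcode records: (md5sum, cid, totab).
BARCODES = [
    ("aaa111", 1, 120),
    ("ccc333", 1, 15),
    ("bbb222", 2, 30),
]


def summarize_otus(barcodes):
    """Collapse metabarcode records into one summary dict per OTU (cid)."""
    members = defaultdict(list)
    for md5sum, cid, totab in barcodes:
        members[cid].append((md5sum, totab))

    otus = {}
    for cid, items in members.items():
        # The representative is the most abundant barcode in the OTU.
        rep_md5, rep_totab = max(items, key=lambda item: item[1])
        otus[cid] = {
            "md5sum": rep_md5,
            "totab": sum(totab for _, totab in items),
            "rtotab": rep_totab,
        }
    return otus


otus = summarize_otus(BARCODES)
for cid, summary in sorted(otus.items()):
    print(cid, summary)
```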

    Abundance of metabarcodes and OTUs, and contextual data of samples selected for a study of the biogeography and diversity of Collodaria (Radiolaria) in the global ocean

    The present data set provides context to 653 samples (including 4 size fractions: 0.8-5 µm, 5-20 µm, 20-180 µm and 180-2000 µm) collected in the [SRF] surface water layer (ENVO:00010504) and the [DCM] deep chlorophyll maximum layer (ENVO:01000326) at 113 sampling stations during the Tara Oceans expedition. The present data set also provides links to the corresponding nucleotide data at the European Nucleotide Archive and the abundance of metabarcodes and OTUs for Rhizaria and Collodaria from the 113 sampling stations. Additional context can be found in the related publications and source data sets.

    Environmental context of selected samples from the Tara Oceans Expedition (2009-2013)

    The present data set provides contextual environmental data for samples from the Tara Oceans Expedition (2009-2013) that were selected for publication in a special issue of the journal Science (see related references below). The data set provides calculated averages of measurements made at the sampling location and depth, calculated averages from climatologies (AMODIS, VGPM) and satellite products.

    RAxML tree inferred from the alignment of 165 representative V9 sequences of leptocylindracean OTUs from the BioMarKs data, six leptocylindracean sequences from GenBank, and 96 reference sequences of bolidomonads, leptocylindraceans and other diatoms, utilizing the GTRGAMMA base substitution model and Hill Climbing algorithm.

    Bolidomonas pacifica and B. mediterranea were designated as outgroups. All non-leptocylindracean sequences were pruned from the tree following tree construction (see Figure S2 for the tree with outgroups included: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0103810#pone.0103810.s002). Bootstrap values were inferred from 100 distinct alternative runs, and values below 50 were removed. OTU labels follow the same principle as in Figure 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0103810#pone-0103810-g001).

    Venn diagrams showing the number of site-specific and shared OTUs among the six sampling stations.

    (A) V4 at Naples, Oslo Fjord, Gijon and Blanes; (B) V9 at Varna, Oslo Fjord, Naples and Roscoff; (C) V9 at Oslo Fjord, Gijon, Blanes and Naples. Venn diagrams for V9 have been split into two figures to compare OTU distribution among the sequence-abundant stations Naples and Oslo and the other four stations.
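The counts behind such a Venn diagram can be derived with plain set operations. The sketch below is a minimal illustration under assumed data: the station names are taken from panel (A), but the OTU identifiers and their distribution are hypothetical, and the function covers only the fully shared and site-specific regions of the diagram.

```python
# Hypothetical OTU sets per station; identifiers are made up for illustration.
OTUS_BY_STATION = {
    "Naples": {"otu1", "otu2", "otu3"},
    "Oslo Fjord": {"otu2", "otu4"},
    "Gijon": {"otu2", "otu3"},
    "Blanes": {"otu2", "otu5"},
}


def venn_counts(otus_by_station):
    """Return (OTUs shared by all stations, site-specific OTUs per station)."""
    shared = set.intersection(*otus_by_station.values())
    specific = {}
    for station, otus in otus_by_station.items():
        # Everything observed at any other station.
        elsewhere = set().union(
            *(o for name, o in otus_by_station.items() if name != station)
        )
        specific[station] = otus - elsewhere
    return shared, specific


shared, specific = venn_counts(OTUS_BY_STATION)
print(sorted(shared))
print({station: sorted(otus) for station, otus in specific.items()})
```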

    RAxML tree inferred from the alignment of 12 representative V4 sequences of leptocylindracean OTUs from the BioMarKs data, 46 leptocylindracean sequences from GenBank, and 134 reference sequences of bolidomonads, Leptocylindraceae and other diatoms, utilizing the GTRGAMMA base substitution model and Hill Climbing algorithm.

    Bolidomonas pacifica and B. mediterranea were designated as outgroups. All non-leptocylindracean reference sequences were pruned from the tree following tree construction (see Figure S1 for the tree with outgroups included: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0103810#pone.0103810.s001). Bootstrap values were inferred from 100 distinct alternative runs, and values below 50 were removed. Each OTU is labelled as follows: the first letter is the initial of the genus; the second letter is the initial of the species; the number that follows is the cluster number (numbering starts from zero); and the number after the underscore is the abundance of the OTU.
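The labelling scheme described above could be decoded along these lines. This is a sketch under an assumed concrete format: two letters, a cluster number, an underscore and an abundance, with no separators other than the underscore. The example label `Ld3_152` is hypothetical, and the labels in the actual figure may differ in detail.

```python
import re
from typing import NamedTuple


class OtuLabel(NamedTuple):
    genus_initial: str
    species_initial: str
    cluster: int      # cluster numbering starts from zero
    abundance: int    # number after the underscore


# Assumed layout: <genus initial><species initial><cluster>_<abundance>
_LABEL_RE = re.compile(r"([A-Za-z])([A-Za-z])(\d+)_(\d+)")


def parse_otu_label(label):
    """Split an OTU label into its four components, or raise ValueError."""
    match = _LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized OTU label: {label!r}")
    genus, species, cluster, abundance = match.groups()
    return OtuLabel(genus, species, int(cluster), int(abundance))


parsed = parse_otu_label("Ld3_152")  # hypothetical label
print(parsed)
```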